Parallelism Utilization in Embedded Reconfigurable Computing Systems: A Survey of Recent Trends
Abstract
Embedded reconfigurable computing has recently attracted great attention due to its potential to accelerate application execution. Its key feature is the ability to perform computations in hardware to increase performance while retaining much of the flexibility of a software solution. Researchers in this field have reported substantial performance improvements for a variety of applications, such as cryptography, multimedia processing, genetics, networking, and DSP. Embedded reconfigurable computing systems lend themselves to high-performance computing by exploiting parallelism at different levels of granularity, ranging from fine-grained instruction-level parallelism to coarse-grained process- and/or task-level parallelism. Taking full advantage of these systems often requires using different parallel processing techniques. In this survey, we explore recent advances in this new field of computing, considering both embedded reconfigurable hardware architectures and the software facilities that target them. The survey focuses on the use of parallelism, which appears to be a key feature in application development for embedded reconfigurable systems. More precisely, four levels of parallelism (Instruction Level, Data/Loop Level, Task Level, and Process/Thread Level) are introduced and distinguished by the properties identified for each category. Various reconfigurable systems incorporating one or a combination of these attributes are investigated. Finally, we try to identify the major problems that keep embedded reconfigurable computing systems from reaching their maximum potential.
Similar resources
Implementation of VLSI Based Image Compression Approach on Reconfigurable Computing System - A Survey
Image data require huge amounts of disk space and large bandwidths for transmission. Hence, image compression is necessary to reduce the amount of data required to represent a digital image. Therefore an efficient technique for image compression is highly pushed to demand. Although, lots of compression techniques are available, but the technique which is faster, memory efficient and simple, surely...
Exploiting loop-level parallelism on coarse-grained reconfigurable architectures using modulo scheduling - Computers and Digital Techniques, IEE Proceedings
Coarse-grained reconfigurable architectures have become increasingly important in recent years. Automatic design or compilation tools are essential to their success. A modulo scheduling algorithm to exploit loop-level parallelism for coarse-grained reconfigurable architectures is presented. This algorithm is a key part of a dynamically reconfigurable embedded systems compiler (DRESC). It is cap...
Evolution in architectures and programming methodologies of coarse-grained reconfigurable computing
Technical report IDE0713. Evolution in Architectures and Programming Methodologies of Coarse-Grained Reconfigurable Computing. Zain-ul-Abdin and Bertil Svensson. In order to meet the increased computational demands of, e.g., multimedia applications, such as video processing in HDTV, and communication applications, such as baseband processing in telecommunication systems, the architectures of recon...
Acceleration of Optical-Flow Extraction Using Dynamically Reconfigurable ALU Arrays
An effective way to implement image processing applications is to use embedded processors with dynamically reconfigurable accelerator cores. The processing speed of these processors depends not only on the parallelism but also on local memory utilization, since the local memories are much faster than the global memory. In this paper, we accelerate the optical-flow extraction algo...
MT-ADRES: Multithreading on Coarse-Grained Reconfigurable Architecture
The coarse-grained reconfigurable architecture ADRES (Architecture for Dynamically Reconfigurable Embedded Systems) and its compiler offer high instruction-level parallelism (ILP) to applications by means of a sparsely interconnected array of functional units and register files. As high-ILP architectures achieve only low parallelism when executing partially sequential code segments, which is al...